Robustifying models against adversarial attacks by Langevin dynamics
Authors
Abstract
Adversarial attacks on deep learning models have compromised their performance considerably. As remedies, a number of defense methods were proposed, which, however, have been circumvented by newer and more sophisticated attacking strategies. In the midst of this ensuing arms race, the problem of robustness against adversarial attacks still remains a challenging task. This paper proposes a novel, simple yet effective defense strategy where off-manifold adversarial samples are driven towards high-density regions of the data generating distribution of the (unknown) target class by the Metropolis-adjusted Langevin algorithm (MALA), with the perceptual boundary taken into account. To achieve this, we introduce a generative model of the conditional distribution of the inputs given labels that can be learned through a supervised Denoising Autoencoder (sDAE) in alignment with a discriminative classifier. Our algorithm, called MALA for DEfense (MALADE), is equipped with significant dispersion: the projection is distributed broadly. This prevents white-box attackers from accurately aligning the input to create an adversarial sample effectively. MALADE is applicable to any existing classifier, providing robust classification as well as detection of adversarial samples. In our experiments, MALADE exhibited state-of-the-art performance against various elaborate attacking strategies.
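To make the projection step concrete, here is a minimal sketch, assuming a trained sDAE whose reconstruction residual r(x) - x approximates sigma^2 times the gradient of the log data density (the standard denoising-autoencoder score estimate). The names sdae, eps, sigma2, and n_steps are illustrative placeholders, and the Metropolis accept/reject correction that turns this plain Langevin update into full MALA is omitted for brevity; this is not the authors' implementation.

```python
import torch

def sdae_score(sdae, x, sigma2=0.1):
    # Denoising-autoencoder score estimate: the residual r(x) - x approximates
    # sigma^2 * grad_x log p(x), so dividing by sigma^2 yields the score.
    return (sdae(x) - x) / sigma2

def langevin_project(sdae, x, n_steps=50, eps=0.05, sigma2=0.1):
    # Drive a (possibly adversarial) input toward high-density regions of the
    # learned distribution. The injected noise is what gives the broad,
    # dispersed projection described in the abstract.
    x = x.clone()
    for _ in range(n_steps):
        noise = torch.randn_like(x)
        x = x + 0.5 * eps ** 2 * sdae_score(sdae, x, sigma2) + eps * noise
    return x
```

The relaxed sample would then be passed to the unchanged classifier, which is why such a defense can wrap any existing model without retraining it.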
Similar resources
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
In recent years, deep neural network approaches have been widely adopted for machine learning tasks, including classification. However, they were shown to be vulnerable to adversarial perturbations: carefully crafted small perturbations can cause misclassification of legitimate images. We propose Defense-GAN, a new framework leveraging the expressive capability of generative models to defend de...
Divide, Denoise, and Defend against Adversarial Attacks
Deep neural networks, although shown to be a successful class of machine learning algorithms, are known to be extremely unstable to adversarial perturbations. Improving the robustness of neural networks against these attacks is important, especially for security-critical applications. To defend against such attacks, we propose dividing the input image into multiple patches, denoising each patch...
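As a rough, hypothetical illustration of the divide-and-denoise idea sketched in that abstract: split the image into non-overlapping patches and denoise each one independently before reassembling. The patch size and the uniform-filter stand-in for the denoiser are assumptions for illustration, not the cited paper's method.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def divide_and_denoise(image: np.ndarray, patch_size: int = 32) -> np.ndarray:
    # Split a 2-D (grayscale) image into non-overlapping patches, denoise each
    # patch with a placeholder box filter, and write it back into the output.
    out = image.copy()
    h, w = image.shape[:2]
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patch = image[i:i + patch_size, j:j + patch_size]
            out[i:i + patch_size, j:j + patch_size] = uniform_filter(patch, size=3)
    return out
```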
Defending Non-Bayesian Learning against Adversarial Attacks
This paper addresses the problem of non-Bayesian learning over multi-agent networks, where agents repeatedly collect partially informative observations about an unknown state of the world, and try to collaboratively learn the true state. We focus on the impact of the adversarial agents on the performance of consensus-based non-Bayesian learning, where non-faulty agents combine local le...
Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models
Many machine learning algorithms are vulnerable to almost imperceptible perturbations of their inputs. So far it was unclear how much risk adversarial perturbations carry for the safety of real-world machine learning applications because most methods used to generate such perturbations rely either on detailed model information (gradient-based attacks) or on confidence scores such as class proba...
Protecting JPEG Images Against Adversarial Attacks
As deep neural networks (DNNs) have been integrated into critical systems, several methods to attack these systems have been developed. These adversarial attacks make imperceptible modifications to an image that fool DNN classifiers. We present an adaptive JPEG encoder which defends against many of these attacks. Experimentally, we show that our method produces images with high visual quality w...
Journal
Journal title: Neural Networks
Year: 2021
ISSN: 1879-2782, 0893-6080
DOI: https://doi.org/10.1016/j.neunet.2020.12.024